# Spatial Relationship Description
Rgb Language Cap
Apache-2.0
This is a vision-language model trained on the COCO dataset. It generates captions that describe the spatial relationships between entities in an image.
Image-to-Text
Transformers English

voxreality
24
0
Rgb Language Cap
MIT
This is a spatially aware vision-language model that recognizes spatial relationships between objects in images and generates descriptive text.
Image-to-Text
Transformers English

sadassa17
15
0